Abstract: The eigenvalues of matrices representing the structure of large-scale complex networks have a wide range of applications, from the analysis of dynamical processes taking place in the network to spectral techniques aiming to rank the importance of nodes in the network. A common approach to study the relationship between the structure of a network and its eigenvalues is to use synthetic random networks in which structural properties of interest, such as degree distributions, are prescribed. Although very common, synthetic models present two major flaws: (i) these models are only suitable to study a very limited range of structural properties, and (ii) they implicitly induce structural properties that are not directly controlled and can deceivingly influence the network eigenvalue spectrum. In this paper, we propose an alternative approach to overcome these limitations. Our approach is not based on synthetic models; instead, we use algebraic graph theory and convex optimization to study how structural properties influence the spectrum of eigenvalues of the network. Using our approach, we can compute, with low computational overhead, global spectral properties of a network from its local structural properties. We illustrate our approach by studying how structural properties of online social networks influence their eigenvalue spectra.



I. Introduction 

During the last decade, the complex structure of many large- 
scale networked systems has attracted the attention of the 
scientific community [1]. The availability of massive databases
describing these networks allows researchers to explore their 
structural properties with great detail. Statistical analysis of 
empirical data has unveiled the existence of multiple common 
patterns in a large variety of network properties, such as 
power-law degree distributions [2], or the small-world phenomenon [3]. Aiming to replicate these structural patterns, a variety of synthetic network models has been proposed in the literature, such as the classical Erdős-Rényi random graph (and its generalizations) [4], [5], the preferential attachment model proposed by Barabási and Albert [2], or the small-world network proposed by Watts and Strogatz [3].

Synthetic network models have been widely used to analyze 
the performance of dynamical processes on a network. In 
this direction, a fundamental question is to understand the
impact of a particular structural property on the performance
of the network [5]. The most common approach to address
this question is to use synthetic network models in which
one can prescribe the structural property under study. The
impact of structural features, such as degree distributions [4],
or clustering Q, has been widely studied in the literature 
following this methodology. Although very common in the 
literature, this approach presents two major flaws: 

1) Synthetic network models are only suitable to study 
a very limited range of structural properties. For ex- 
ample, synthetic random networks presenting structural 
properties beyond simple degree distributions become 
intractable from a spectral point of view. 

2) Synthetic network models implicitly induce many struc- 
tural properties that are not directly controlled and 
can be relevant to the network dynamical performance. 
Therefore, it is difficult to isolate the role of a particular 
structural property using synthetic network models. 

Since a network's eigenvalues influence the behavior of the
dynamical processes that can take place in the
network [7]-[11], it is of interest to study the relationship
between the structural properties of the network, such as 
the distribution of degrees, triangles and other substructures, 
and its eigenvalue spectrum. In this paper, we propose a 
novel framework, based on spectral and algebraic graph theory 
and convex optimization, to compute with low computational 
overhead global spectral properties of a network from its 
local structural properties. In particular, we derive optimal 
bounds and estimators of spectral properties of interest from 
structural information. Our results are useful to unveil the set 
of structural properties that have the highest impact in the 
eigenvalue spectrum of a network. In particular, in the case of 
online social networks, we find that the correlation between 
the distribution of degrees and triangles in the network plays 
a key role in the spectral radius. 

The rest of this paper is organized as follows. In the next 
section, we review graph-theoretical terminology needed in our 
derivations. We also review existing bounds and estimators of 
spectral properties of a network in terms of structural properties.
In Section III, we use algebraic graph theory to derive
closed-form expressions for the so-called spectral moments
of a network. In Section IV, we use convex optimization to
derive optimal bounds on spectral properties of interest from
these moments. We numerically verify the performance of our
bounds using real network data in Section V, where we also
use our results to unveil the set of structural properties with 
the highest influence on the spectral radius of social networks. 

II. Notation & Preliminaries 

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ denote an undirected graph with $n$ nodes, $e$ edges, and no self-loops.¹ We denote by $\mathcal{V}(\mathcal{G}) = \{v_1, \ldots, v_n\}$ the set of nodes and by $\mathcal{E}(\mathcal{G}) \subseteq \mathcal{V}(\mathcal{G}) \times \mathcal{V}(\mathcal{G})$ the set of undirected edges of $\mathcal{G}$. If $\{v_i, v_j\} \in \mathcal{E}(\mathcal{G})$, we call nodes $v_i$ and $v_j$ adjacent (or neighbors), which we denote by $v_i \sim v_j$. We define a walk of length $k$ from $v_0$ to $v_k$ to be an ordered sequence of nodes $(v_0, v_1, \ldots, v_k)$ such that $v_i \sim v_{i+1}$ for $i = 0, 1, \ldots, k-1$. If $v_0 = v_k$, then the walk is closed. A closed walk with no repeated nodes (with the exception of the first and last nodes) is called a cycle. For example, triangles, quadrangles and pentagons are cycles of length three, four, and five, respectively.

1 An undirected graph with no self-loops is also called a simple graph.

Graphs can be algebraically represented via matrices. The adjacency matrix of an undirected graph $\mathcal{G}$, denoted by $A_\mathcal{G} = [a_{ij}]$, is an $n \times n$ symmetric matrix defined entry-wise as $a_{ij} = 1$ if nodes $v_i$ and $v_j$ are adjacent, and $a_{ij} = 0$ otherwise.² The eigenvalues of $A_\mathcal{G}$, denoted by $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$, play a key role in our paper. The spectral radius of $A_\mathcal{G}$, denoted by $\rho(A_\mathcal{G})$, is the maximum among the magnitudes of its eigenvalues. Since $A_\mathcal{G}$ is a symmetric matrix with nonnegative entries, all its eigenvalues are real and the spectral radius is equal to the largest eigenvalue, $\lambda_1$. We define the $k$-th spectral moment of the adjacency matrix $A_\mathcal{G}$ as

$$m_k(A_\mathcal{G}) = \frac{1}{n} \sum_{i=1}^{n} \lambda_i^k. \tag{1}$$



As we shall show in Section III, there is a direct connection
between the spectral moments and the presence of certain
substructures in the graph, such as cycles of length $k$.

We define the set of neighbors of $v$ as $\mathcal{N}_v = \{w \in \mathcal{V}(\mathcal{G}) : \{v, w\} \in \mathcal{E}(\mathcal{G})\}$. The number of neighbors of $v$ is called the degree of node $v$, denoted by $d_v$. We can define several local neighborhoods around a node $v$ based on the concept of distance. Let $d(v, w)$ denote the distance between two nodes $v$ and $w$ (i.e., the minimum length of a walk from $v$ to $w$). We say that $v$ and $w$ are $k$-hop neighbors if $d(v, w) = k$, and define the $k$-th order neighborhood of $v$ as $\mathcal{N}_v^{(k)} = \{w \in \mathcal{V}(\mathcal{G}) : d(v, w) \leq k\}$. The set of nodes in $\mathcal{N}_v^{(k)}$ induces a subgraph $\mathcal{G}_v^{(k)} \subseteq \mathcal{G}$, with node-set $\mathcal{N}_v^{(k)}$ and edge-set $\mathcal{E}_v^{(k)} \subseteq \mathcal{E}(\mathcal{G})$ defined as the subset of edges connecting nodes in $\mathcal{N}_v^{(k)}$.
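As an illustration of the neighborhood definitions above, the following Python sketch computes a $k$-th order neighborhood by breadth-first search and lists the edges of the subgraph it induces. The function names, the adjacency-dictionary representation, and the path graph are assumptions made for this example only; they are not part of the paper.

```python
from collections import deque


def k_hop_neighborhood(adjacency, v, k):
    """Return the set of nodes w with d(v, w) <= k, found by breadth-first search."""
    distance = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if distance[u] == k:
            continue  # do not expand beyond radius k
        for w in adjacency[u]:
            if w not in distance:
                distance[w] = distance[u] + 1
                queue.append(w)
    return set(distance)


def induced_edges(adjacency, nodes):
    """Edges of the subgraph induced by `nodes`, each stored as a frozenset {u, w}."""
    return {frozenset((u, w)) for u in nodes for w in adjacency[u] if w in nodes}


# Hypothetical path graph 0 - 1 - 2 - 3 - 4, stored as an adjacency dictionary.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
ball = k_hop_neighborhood(path, 0, 2)
print(sorted(ball))                    # nodes within two hops of node 0
print(len(induced_edges(path, ball)))  # number of edges in the induced subgraph
```

Note that, following the definition with $d(v, w) \leq k$, the node $v$ itself belongs to its own neighborhood in this sketch.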

A. Estimators of the Spectral Radius 

Random network models are currently the primary tool to 
study the relationship between the structure and dynamics of 
complex networks [6]. Although many random networks have
been proposed to analyze structural properties such as the 
degree distribution (U, or clustering Q, only random networks 
including a very limited amount of structural information are 
currently amenable to spectral analysis. 

In the original Erdős-Rényi random graph with $n$ nodes, denoted by $G(n, p)$, each edge is independently chosen with a fixed probability $p$ [12]. In this model, all the nodes present the same expected degree, $\mathbb{E}[d_i] = np$, and the largest eigenvalue of its adjacency matrix is almost surely $[1 + o(1)]\, np$ (assuming that $np = \Omega(\log n)$). Although very interesting from a theoretical point of view, the original random graph presents very limited modeling capabilities, since the degree distributions of real-world networks are almost never uniform.

2 For simple graphs, $a_{ii} = 0$ for all $i$.

In order to increase the modeling abilities of random graphs, several models have been proposed in the literature. For example, given a sequence $w = (w_1, \ldots, w_n)$, Chung and Lu proposed in [13] a random graph $G(w)$ with an expected sequence of degrees equal to $w$. In this random graph, edges are independently assigned to each pair of vertices $(i, j)$ with probability $w_i w_j / \sum_{k=1}^{n} w_k$. Chung et al. proved in [14] that if $\sum_{i=1}^{n} w_i^2 / \sum_{j=1}^{n} w_j > \sqrt{\max_i \{w_i\}}\, \log n$, then the largest eigenvalue $\lambda_1(G(w))$ converges almost surely as

$$\lambda_1(G(w)) \xrightarrow{a.s.} [1 + o(1)]\, \frac{\sum_{i=1}^{n} w_i^2}{\sum_{j=1}^{n} w_j} \tag{2}$$

for large $n$. Despite their theoretical interest, random graphs with a given degree distribution are far from sufficient to faithfully model the structure of real complex networks.
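The estimator in (2) only needs the expected degree sequence, which a few lines of Python make concrete. This is a minimal sketch: the function names and the two degree sequences are hypothetical, chosen only so that the edge probabilities $w_i w_j / \sum_k w_k$ stay below one.

```python
import math


def chung_lu_estimate(w):
    """Leading term of (2): sum of squared expected degrees over their sum."""
    return sum(x * x for x in w) / sum(w)


def chung_lu_condition(w):
    """Sufficient condition stated before (2): estimate > sqrt(max w) * log n."""
    return chung_lu_estimate(w) > math.sqrt(max(w)) * math.log(len(w))


# Hypothetical expected-degree sequences, used only for illustration.
uniform = [50.0] * 1000                # every node expects degree 50
skewed = [100.0] * 50 + [20.0] * 950   # a few hubs, many low-degree nodes

print(chung_lu_estimate(uniform))      # a uniform sequence returns the common degree
print(chung_lu_estimate(skewed))
print(chung_lu_condition(uniform), chung_lu_condition(skewed))
```

When the condition fails, as it does for the second sequence, the sketch says nothing about the quality of the estimate; it only reports that the hypothesis of the cited result is not met.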

Although random graph models with more elaborate structural properties, such as clustering or hierarchy, can be found in the literature, these models are usually extremely challenging (if not impossible) to analyze from a spectral point of view. The source of this intractability is the presence of strong correlations among the entries of the (random) adjacency matrices associated with these models. In Sections III and IV, we introduce an alternative method to analyze the effect of elaborate structural properties, such as clustering and correlations, on the eigenvalues of a network without the use of intractable random graph models.

B. Bounds on the Spectral Radius 

In this subsection, we review some existing bounds relating structural features of a network, such as the degree distribution, with the spectral radius of the network. We can find in the literature several bounds on the spectral radius that are not based on random models. For example, for a graph $\mathcal{G}$ with $n$ nodes and $e$ edges, we have the following upper bounds for the spectral radius [15]:

$$u_1 = \sqrt{2e - (n - 1)\, d_{\min} + (d_{\min} - 1)\, d_{\max}},$$

$$u_2 = \max \left\{ \sqrt{d_i m_j} : (i, j) \in \mathcal{E} \right\},$$

where $d_{\min}$ and $d_{\max}$ are the minimum and maximum degrees of $\mathcal{G}$, and $m_i = \frac{1}{d_i} \sum_{j \in \mathcal{N}_i} d_j$. Notice that none of the above bounds takes into account the presence of triangles, or other cycles, in the graph. Since many real-world networks present a high density of cycles (e.g., social graphs), these bounds perform poorly in many real applications. In the following sections, we propose a methodology to derive bounds on spectral properties of relevance in terms of a wide variety of structural features, including the distribution of cycles in the network.

III. Moment-Based Analysis of the Adjacency 
Spectrum 

Algebraic graph theory provides us with tools to relate the eigenvalues of a network with its structural properties. Particularly useful is the following result relating the $k$-th spectral moment of $A_\mathcal{G}$ with the number of closed walks of length $k$ in $\mathcal{G}$ [16]:

Lemma 3.1: Let $\mathcal{G}$ be a simple graph. The $k$-th spectral moment of the adjacency matrix of $\mathcal{G}$ can be written as

$$m_k(A_\mathcal{G}) = \frac{1}{n} \sum_{i=1}^{n} \lambda_i^k = \frac{1}{n} \left| \mathcal{W}^{(k)} \right|, \tag{3}$$

where $\mathcal{W}^{(k)}$ is the set of all closed walks of length $k$ in $\mathcal{G}$.³

In the following subsections, we build on Lemma 3.1 to compute the spectral moments of a network in terms of relevant structural features.
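Lemma 3.1 can be checked numerically on a small graph, because the number of closed walks of length $k$ equals the trace of the $k$-th power of the adjacency matrix. The sketch below uses plain Python lists; the helper names and the choice of the complete graph on four nodes are assumptions for illustration. That graph has eigenvalues $3, -1, -1, -1$, so its moments are known in closed form.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def closed_walks(adj, k):
    """Number of closed walks of length k, computed as the trace of adj**k."""
    n = len(adj)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(k):
        power = mat_mul(power, adj)
    return sum(power[i][i] for i in range(n))


def spectral_moment(adj, k):
    """k-th spectral moment via Lemma 3.1: closed walks of length k over n."""
    return closed_walks(adj, k) / len(adj)


def complete_graph(n):
    """Adjacency matrix of the complete graph on n nodes (no self-loops)."""
    return [[int(i != j) for j in range(n)] for i in range(n)]


k4 = complete_graph(4)
# Eigenvalues of K4 are 3, -1, -1, -1, so m_k = (3**k + 3 * (-1)**k) / 4.
for k in range(1, 6):
    print(k, spectral_moment(k4, k), (3 ** k + 3 * (-1) ** k) / 4)
```

Matrix powers are used here only as a reference computation; the point of the following subsections is to avoid them by counting local substructures instead.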



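A. Low-Order Spectral Moments

From (3), we can easily compute the first three moments of $A_\mathcal{G}$ in terms of the distribution of degrees and triangles as follows [16]:

Corollary 3.2: Let $\mathcal{G}$ be a simple graph with adjacency matrix $A_\mathcal{G}$. Denote by $d_i$ and $t_i$ the number of edges and triangles touching node $i \in \mathcal{V}(\mathcal{G})$, respectively. Then,

$$\begin{aligned} m_1(A_\mathcal{G}) &= 0, \\ m_2(A_\mathcal{G}) &= \frac{1}{n} \sum_{i \in \mathcal{V}(\mathcal{G})} d_i, \\ m_3(A_\mathcal{G}) &= \frac{1}{n} \sum_{i \in \mathcal{V}(\mathcal{G})} 2\, t_i. \end{aligned} \tag{4}$$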


Proof: Since there are no self-loops in a simple graph, we have that $m_1(A_\mathcal{G}) = 0$. In order to compute $m_2(A_\mathcal{G})$, we need to count the number of closed walks of length two starting at a node $i$. The number of walks of this type is equal to $d_i$. Summing over all possible starting nodes we obtain $\sum_{i \in \mathcal{V}(\mathcal{G})} d_i$. The third moment is proportional to the number of closed walks of length 3. Starting at node $i$, there are $2 t_i$ walks of this type, where the coefficient 2 accounts for the two possible directions one can walk each triangle. Summing over all possible starting points, we obtain $\sum_{i \in \mathcal{V}(\mathcal{G})} 2 t_i$. ■

These moments can also be expressed in terms of the total number of edges and triangles in $\mathcal{G}$, which we denote by $e$ and $\Delta$, respectively. Since $e = \frac{1}{2} \sum_i d_i$ and $\Delta = \frac{1}{3} \sum_i t_i$, we have that:

$$\begin{aligned} m_1(A_\mathcal{G}) &= 0, \\ m_2(A_\mathcal{G}) &= 2e/n, \\ m_3(A_\mathcal{G}) &= 6\Delta/n, \end{aligned} \tag{5}$$

where the coefficients 2 (resp. 6) in the above expressions correspond to the number of closed walks of length 2 (resp. 3) enabled by the presence of an edge (resp. triangle). The computation of higher-order moments requires a more elaborate combinatorial analysis. We include details for the fourth and fifth spectral moments in the following subsections.
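The low-order expressions can be exercised on a toy example. The sketch below counts degrees and triangles per node and evaluates the second and third moments both from the per-node sums and from the aggregated totals $e$ and $\Delta$. The five-node graph (a triangle with a two-edge tail) and all function names are hypothetical choices for this illustration.

```python
from itertools import combinations


def from_edges(n, edges):
    """Adjacency matrix of a simple undirected graph given as an edge list."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def degrees_and_triangles(adj):
    """Per-node degree d_i and number of triangles t_i touching node i."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    tri = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        # A triangle touching i is an edge between two neighbors of i.
        tri.append(sum(1 for u, v in combinations(nbrs, 2) if adj[u][v]))
    return deg, tri


def low_order_moments(adj):
    """First three spectral moments from per-node degrees and triangles."""
    n = len(adj)
    deg, tri = degrees_and_triangles(adj)
    return 0.0, sum(deg) / n, 2 * sum(tri) / n


# Hypothetical graph: triangle 0-1-2 with a tail 2-3-4.
g = from_edges(5, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)])
deg, tri = degrees_and_triangles(g)
e, delta, n = sum(deg) // 2, sum(tri) // 3, len(g)
print(low_order_moments(g))            # per-node form
print((0.0, 2 * e / n, 6 * delta / n))  # aggregated form
```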

3 We denote by $|Z|$ the cardinality of a set $Z$.



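B. Fourth-Order Spectral Moments

A combinatorial analysis of (3) for $k = 4$ gives us the following result:

Lemma 3.3: Let $\mathcal{G}$ be a simple graph with adjacency matrix $A_\mathcal{G}$. Denote by $q_i$ and $d_i$ the number of quadrangles and edges touching node $i \in \mathcal{V}(\mathcal{G})$, respectively. Then,

$$m_4(A_\mathcal{G}) = \frac{1}{n} \sum_{i \in \mathcal{V}(\mathcal{G})} \left[ 2 q_i + 4 \binom{d_i}{2} + d_i \right]. \tag{6}$$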



Proof: We compute the fourth moment from (3) by counting the number of closed walks of length 4 in $\mathcal{G}$. In Fig. 1, we enumerate all the possible types of closed walks of length 4. We can count the number of closed walks of each particular type in terms of network structural features as follows:

(a) The number of closed walks of type (a) starting at node $i$ is equal to twice the number of quadrangles touching $i$, where the coefficient 2 in $w^{(a)} = \sum_{i=1}^{n} 2 q_i$ accounts for the two possible directions (clockwise and counterclockwise) one can walk each quadrangle.

(b) The number of walks of type (b) starting at node $i$ is equal to $2 \binom{d_i}{2}$. The expression $w^{(b)} = \sum_{i=1}^{n} 2 \binom{d_i}{2}$ comes from summing over all possible starting points, $i = 1, \ldots, n$.

(c) The number of closed walks of this type can also be written in terms of the degrees as:

$$w^{(c)} = \sum_{i=1}^{n} \sum_{j \in \mathcal{N}_i} (d_j - 1) = \sum_{i=1}^{n} (d_i - 1)\, d_i.$$

(d) The number of closed walks of this type starting at node $i$ is equal to $d_i$; thus, $w^{(d)} = \sum_{i=1}^{n} d_i$.

Hence, we obtain (6) by summing up all the above contributions, $w^{(a)} + w^{(b)} + w^{(c)} + w^{(d)}$ (and simple algebraic manipulations). ■



Lemma 3.3 provides an expression to compute the fourth spectral moment in terms of structural features, namely, the distribution of degrees and quadrangles. We illustrate Lemma 3.3 in the following example.



Example 3.1: Consider the $n$-ring graph, $R_n$ (without self-loops). The eigenvalues of the adjacency matrix of the ring graph are $\lambda_i(A_{R_n}) = 2 \cos \frac{2 \pi i}{n}$, for $i = 0, 1, \ldots, n - 1$. Hence, the fourth moment is equal to $m_4(A_{R_n}) = \frac{1}{n} \sum_{i=0}^{n-1} \left( 2 \cos \frac{2 \pi i}{n} \right)^4$, which (after some computations) can be found to be equal to 6 for $n \notin \{2, 4\}$. We can reach this same result by directly applying (6), without performing an eigenvalue decomposition, as follows. In the ring graph, we have that $d_i = 2$ and $q_i = 0$, for $n \notin \{2, 4\}$. Hence, from (6), we directly obtain $m_4(A_{R_n}) = 6$, for $n \notin \{2, 4\}$.
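The two routes to the fourth moment of the ring graph can be compared numerically. The sketch below evaluates the eigenvalue sum and the structural expression side by side for a few ring sizes; the function names are hypothetical. For $n = 4$ the ring is itself a quadrangle, so the sketch sets $q_i = 1$ in that case, which is the reason the value 6 does not apply there.

```python
import math


def ring_m4_from_eigenvalues(n):
    """Fourth moment from the eigenvalues 2*cos(2*pi*i/n), i = 0..n-1."""
    return sum((2 * math.cos(2 * math.pi * i / n)) ** 4 for i in range(n)) / n


def ring_m4_from_structure(n):
    """Fourth moment from degrees and quadrangles per node, valid for n >= 3.

    Every node has degree 2; a node touches one quadrangle only when n == 4.
    """
    d = 2
    q = 1 if n == 4 else 0
    return 2 * q + 4 * math.comb(d, 2) + d


for n in (3, 4, 5, 6, 9, 12):
    print(n, round(ring_m4_from_eigenvalues(n), 9), ring_m4_from_structure(n))
```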

Fig. 1. Enumeration of the possible types of closed walks of length 4 in a graph with no self-loops. The classification is based on the structure of the subgraph underlying each closed walk. For each walk type, we also include an expression that corresponds to the number of closed walks of that particular type in terms of network structural properties. Panels: (a) $\sum_{i=1}^{n} 2 q_i$; (b) $\sum_{i=1}^{n} 2 \binom{d_i}{2}$; (c) $\sum_{i=1}^{n} (d_i - 1)\, d_i$; (d) $\sum_{i=1}^{n} d_i$.

The fourth spectral moment can be rewritten in terms of aggregated quantities, such as the total number of quadrangles and edges, and the sum-of-squares of the degrees, as follows:



Corollary 3.4: Let $\mathcal{G}$ be a simple graph. Denote by $e$ and $Q$ the total number of edges and quadrangles in $\mathcal{G}$, respectively, and define $W_2 = \sum_{i=1}^{n} d_i^2$. Then,

$$m_4(A_\mathcal{G}) = \frac{1}{n} \left[ 8Q + 2W_2 - 2e \right]. \tag{7}$$

Proof: The proof follows directly from (6) by substituting $\sum_{i=1}^{n} q_i = 4Q$ and $\sum_{i=1}^{n} d_i = 2e$. ■

Hence, we do not need to have access to the detailed distribution of quadrangles and degrees in $\mathcal{G}$ to compute the fourth moment; we only need to know the aggregated quantities $Q$, $W_2$, and $e$.
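A brute-force check of (7) on a small graph may help make the aggregated quantities concrete. The sketch below counts quadrangles by testing the three possible 4-cycles on every set of four nodes, which is only practical for tiny graphs, and compares (7) with the trace of the fourth power of the adjacency matrix. The example graph (a quadrangle with one chord and one pendant node) and the function names are hypothetical.

```python
from itertools import combinations


def from_edges(n, edges):
    """Adjacency matrix of a simple undirected graph given as an edge list."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def count_quadrangles(adj):
    """Total number of cycles of length 4 (brute force over 4-node subsets)."""
    total = 0
    for a, b, c, d in combinations(range(len(adj)), 4):
        # Four labeled nodes support exactly three distinct 4-cycles.
        for p, q, r, s in ((a, b, c, d), (a, b, d, c), (a, c, b, d)):
            if adj[p][q] and adj[q][r] and adj[r][s] and adj[s][p]:
                total += 1
    return total


def m4_from_structure(adj):
    """Fourth moment from Q, W2 and e, as in (7)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    e = sum(degrees) // 2
    w2 = sum(d * d for d in degrees)
    return (8 * count_quadrangles(adj) + 2 * w2 - 2 * e) / n


def m4_from_walks(adj):
    """Fourth moment as trace(A^4)/n, using trace(A^4) = sum of squared entries of A^2."""
    n = len(adj)
    a2 = [[sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return sum(a2[i][j] ** 2 for i in range(n) for j in range(n)) / n


# Hypothetical graph: quadrangle 0-1-2-3 with chord 0-2 and pendant node 4.
g = from_edges(5, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (3, 4)])
print(count_quadrangles(g), m4_from_structure(g), m4_from_walks(g))
```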

C. Fifth-Order Moment

Lemma 3.5: Let $\mathcal{G}$ be a simple graph. Denote by $p_i$, $t_i$, and $d_i$ the number of pentagons, triangles, and edges touching node $i \in \mathcal{V}(\mathcal{G})$, respectively. Then,

$$m_5(A_\mathcal{G}) = \frac{1}{n} \sum_{i \in \mathcal{V}(\mathcal{G})} \left[ 2 p_i + 10\, t_i d_i - 10\, t_i \right]. \tag{8}$$

Proof: The proof follows the same structure as that of Lemma 3.3. A graphical representation of the types of closed walks of length 5 is provided in Fig. 2. Details regarding the counting of closed walks of each particular type can be found in the Appendix. ■

Lemma 3.5 expresses the fifth spectral moment of $A_\mathcal{G}$ in terms of network structural features. We can rewrite (8) in terms of aggregated quantities as follows:

Corollary 3.6: Let $\mathcal{G}$ be a simple graph. Denote by $\Delta$ and $\Pi$ the total number of triangles and pentagons in $\mathcal{G}$, respectively. Define the degree-triangle correlation as $C_{dt} = \sum_i d_i t_i$. Then,

$$m_5(A_\mathcal{G}) = \frac{1}{n} \left[ 10\Pi + 10\, C_{dt} - 30\Delta \right]. \tag{9}$$

Proof: The proof comes from (8), taking into account that $\sum_{i=1}^{n} p_i = 5\Pi$ and $\sum_{i=1}^{n} t_i = 3\Delta$. ■
Observe how, as we increase the order of the moments, more complicated structural features appear in the expressions. In particular, the sum and sum-of-squares of the degrees influence the second and fourth spectral moments. (Notice that we can expand $\binom{d_i}{2} = \frac{1}{2} (d_i^2 - d_i)$ in (6).) The total number of triangles in the network, $\Delta$, influences the third and fifth moments in (5) and (9). Also, the correlation between degree and triangle distributions, quantified by $C_{dt} = \sum_i d_i t_i$, influences the fifth spectral moment in (9). We shall show in Section V that this structural correlation strongly influences the spectral radius of online social networks.
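The fifth-moment expression (9) can be checked in the same brute-force spirit. The sketch below counts pentagons by enumerating Hamiltonian cycles on every set of five nodes, computes the degree-triangle correlation $C_{dt}$, and compares (9) with the trace of the fifth power of the adjacency matrix. The two test graphs and all function names are hypothetical, and the enumeration is only meant for very small graphs.

```python
from itertools import combinations, permutations


def from_edges(n, edges):
    """Adjacency matrix of a simple undirected graph given as an edge list."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def triangles_per_node(adj):
    """t_i: number of edges between pairs of neighbors of node i."""
    n = len(adj)
    return [sum(1 for u, v in combinations([j for j in range(n) if adj[i][j]], 2)
                if adj[u][v]) for i in range(n)]


def count_pentagons(adj):
    """Total number of cycles of length 5 (brute force over 5-node subsets)."""
    total = 0
    for nodes in combinations(range(len(adj)), 5):
        first, rest = nodes[0], nodes[1:]
        for perm in permutations(rest):
            cycle = (first,) + perm
            if all(adj[cycle[i]][cycle[(i + 1) % 5]] for i in range(5)):
                total += 1
    return total // 2  # each pentagon is found once per direction


def m5_from_structure(adj):
    """Fifth moment from pentagons, triangles and C_dt, as in (9)."""
    n = len(adj)
    d = [sum(row) for row in adj]
    t = triangles_per_node(adj)
    c_dt = sum(di * ti for di, ti in zip(d, t))
    return (10 * count_pentagons(adj) + 10 * c_dt - 30 * (sum(t) // 3)) / n


def m5_from_walks(adj):
    """Fifth moment as trace(A^5)/n."""
    n = len(adj)
    power = [row[:] for row in adj]
    for _ in range(4):
        power = [[sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return sum(power[i][i] for i in range(n)) / n


k5 = from_edges(5, list(combinations(range(5), 2)))
chorded = from_edges(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)])
print(m5_from_structure(k5), m5_from_walks(k5))
print(m5_from_structure(chorded), m5_from_walks(chorded))
```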

The main advantage of our results may not be apparent in 
networks with simple, regular structure. For these networks, an 
explicit eigenvalue decomposition is usually easy to compute 
and there may be no need to look for alternative ways to 
compute spectral properties. On the other hand, in the case of 
large-scale complex networks, the structure of the network can 
be very intricate — in many cases not even known exactly — 
and an explicit eigenvalue decomposition can be very chal- 
lenging to compute, if not impossible. It is in these cases 
when the alternative approach proposed in this paper is most 
useful. In the following subsection, we use our expressions 
to compute the spectral moments of an online social network 
from empirical structural data. 

D. Spectral Moments of an Online Social Network 

The real network under study is a subgraph of Facebook 
with 2,404 nodes and 22,786 edges obtained from crawling 
the graph in a breadth-first search around a particular node (the
dataset can be found in [17]). Although the approach proposed
in this section is meant to be used for much larger networks, 
we illustrate our results with this medium-size subgraph in 
order to compare our analysis with the results obtained from 
an explicit eigenvalue decomposition of the complete network 
topology. 

In this example, we first compute the structural metrics involved in the first five spectral moments, in particular, the degree $d_i$ and the number of triangles $t_i$, quadrangles $q_i$, and pentagons $p_i$ touching each node $i \in \mathcal{V}$. The degree $d_i$ and the number of triangles $t_i$ touching node $i$ can be easily computed by counting the number of edges attached to node $i$ and the number of edges connecting friends of $i$, respectively. In order to count the number of quadrangles $q_i$ and pentagons $p_i$ touching each node $i$, we must know the structure of the network around node $i$ with a radius of 2, i.e., node $i$ needs to know who her friends' friends are. We denote by $|\mathcal{N}_{i,2}|$ the number of nodes in the two-hop neighborhood around node $i$ (excluding node $i$). In order to count the number of quadrangles (resp. pentagons) touching node $i$, we must verify the presence of a cycle for each one of the $\binom{|\mathcal{N}_{i,2}|}{3}$ (resp. $\binom{|\mathcal{N}_{i,2}|}{4}$) subsets of three (resp. four) nodes in $\mathcal{N}_{i,2}$.

Fig. 2. Possible types of closed walks of length 5 in a simple graph. The classification is based on the structure of the subgraph underlying the closed walks. For each walk type, we also include an expression that corresponds to the number of closed walks of that particular type in terms of network structural features.

In Fig. 3, we plot the distributions of degrees and triangles, as well as a scatter plot of $t_i$ versus $d_i$ (where each point has coordinates $(d_i, t_i)$, in log-log scale, for all $i \in \mathcal{V}(\mathcal{G})$). We then aggregate, via simple averaging, those structural metrics that are relevant to compute the spectral moments. In particular, we obtain the following numerical values for these metrics:



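$$\begin{aligned} e/n &= \textstyle\sum_i d_i / 2n = 9.478, & \Delta/n &= \textstyle\sum_i t_i / 3n = 28.15, \\ Q/n &= \textstyle\sum_i q_i / 4n = 825.3, & \Pi/n &= \textstyle\sum_i p_i / 5n = 31{,}794, \\ W_2/n &= \textstyle\sum_i d_i^2 / n = 1{,}318, & C_{dt}/n &= \textstyle\sum_i d_i t_i / n = 8{,}520. \end{aligned}$$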



Hence, using these values in expressions (4), (7), and (9), we obtain the following spectral moments: $m_1(A_\mathcal{G}) = 0$, $m_2(A_\mathcal{G}) = 18.95$, $m_3(A_\mathcal{G}) = 168.9$, $m_4(A_\mathcal{G}) = 9{,}230$, and $m_5(A_\mathcal{G}) = 402{,}310$.
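The arithmetic behind these moments is a direct substitution of the averaged metrics into the aggregated expressions $2e/n$, $6\Delta/n$, (7), and (9). The short sketch below repeats it with the rounded values printed above; the variable names are ours. Because the inputs are rounded to four significant digits, the results are expected to match the reported moments only approximately (to within a fraction of a percent), not digit for digit.

```python
# Averaged structural metrics of the Facebook subgraph, as reported in the text.
e_n, delta_n = 9.478, 28.15      # e/n and Delta/n
q_n, pi_n = 825.3, 31794.0       # Q/n and Pi/n
w2_n, cdt_n = 1318.0, 8520.0     # W2/n and C_dt/n

m2 = 2 * e_n                                     # second moment, 2e/n
m3 = 6 * delta_n                                 # third moment, 6*Delta/n
m4 = 8 * q_n + 2 * w2_n - 2 * e_n                # fourth moment, expression (7)
m5 = 10 * pi_n + 10 * cdt_n - 30 * delta_n       # fifth moment, expression (9)

reported = {"m2": 18.95, "m3": 168.9, "m4": 9230.0, "m5": 402310.0}
computed = {"m2": m2, "m3": m3, "m4": m4, "m5": m5}
for name in reported:
    rel = abs(computed[name] - reported[name]) / reported[name]
    print(name, computed[name], reported[name], f"relative gap {rel:.1e}")
```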

In this section, we have derived expressions to compute the first five spectral moments of $A_\mathcal{G}$ from network structural properties. In the next section, we use semidefinite programming to extract bounds on spectral properties of interest from a sequence of spectral moments.



IV. Optimal Spectral Bounds from Spectral 
Moments 

In this section, we introduce an approach to derive bounds on a network's spectral properties from its sequence of spectral moments. Since we have expressions for the spectral moments in terms of structural properties, these bounds relate the eigenvalues of a network with its structural properties. For this purpose, we adapt an optimization framework proposed in [18] and [19] to derive optimal probabilistic bounds on a random variable from a sequence of moments of its probability distribution. In order to use this framework, we first need to introduce a probabilistic interpretation of a network's eigenvalue spectrum and its spectral moments.



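For a simple graph $\mathcal{G}$, we define its spectral density as

$$\mu_\mathcal{G}(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - \lambda_i), \tag{10}$$

where $\delta(\cdot)$ is the Dirac delta function and $\{\lambda_i\}_{i=1}^{n}$ is the set of (real) eigenvalues of the symmetric adjacency matrix $A_\mathcal{G}$. Let us define a random variable $X$ with probability density $\mu_\mathcal{G}$. The moments of $X \sim \mu_\mathcal{G}$ are equal to the spectral moments of $A_\mathcal{G}$, i.e.,

$$\begin{aligned} \mathbb{E}_{\mu_\mathcal{G}}(X^k) &= \int_{\mathbb{R}} x^k \mu_\mathcal{G}(x)\, dx \\ &= \frac{1}{n} \sum_{i=1}^{n} \int_{\mathbb{R}} x^k \delta(x - \lambda_i)\, dx \\ &= \frac{1}{n} \sum_{i=1}^{n} \lambda_i^k = m_k(A_\mathcal{G}), \end{aligned}$$

for all $k \geq 0$. Furthermore, for a given Borel measurable set $T$, we have

$$\Pr(X \in T) = \int_{T} \mu_\mathcal{G}(x)\, dx = \frac{1}{n} \left| \{ \lambda_i : \lambda_i \in T \} \right|.$$

In other words, the probability of the random variable $X$ being in a set $T$ is proportional to the number of eigenvalues of $A_\mathcal{G}$ in $T$.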

In this probabilistic context, we can study two problems 
that are relevant for the network dynamical behavior. Given a 
truncated sequence of spectral moments, we formulate these 
problems as follows: 

Problem 1: Find optimal bounds on the number of eigen- 
values that can lie in a given interval T. 

Problem 2: Find bounds on the smallest and largest eigen- 
values of Ag. 

In the following subsections, we provide solutions to each 
one of the above problems, from only the knowledge of a 
truncated sequence of moments. 

A. Solution to Problem 1

Our solution is based on the following classical problem in analysis:

Problem 3 (Moment Problem): Given a sequence of moments $(m_0, \ldots, m_k)$, and Borel measurable sets $T \subseteq \Omega \subseteq \mathbb{R}$, compute:

$$\begin{aligned} Z_P = \max_{\mu}\ & \int_{T} 1\, d\mu \\ \text{s.t.}\ & \int_{\Omega} x^j\, d\mu = m_j, \quad \text{for } j = 0, 1, \ldots, k, \end{aligned} \tag{11}$$

where $\mu \in \mathcal{M}(\Omega)$, $\mathcal{M}(\Omega)$ being the set of positive Borel measures supported by $\Omega$.

Fig. 3. In the left and center figures, we plot the distributions of degrees and triangles of the social network under study (in log-log scale). In the right figure, we include a scatter plot where each point has coordinates $(d_i, t_i)$, in log-log scale, for all the nodes in the social graph.

The solution to this problem provides an extension to the classical Markov and Chebyshev inequalities in probability theory when moments of order greater than 2 are available. In [18] and [19], it was shown that the optimal value of $Z_P$ can be efficiently computed by solving a single semidefinite program using a dual formulation. Before we introduce this dual formulation, it is important to discuss some details regarding the feasibility of this problem.

A sequence of moments $\mathbf{m}^{(k)} = (m_0, m_1, \ldots, m_k)$ is said to be feasible in $\Omega$ if there exists a measure $\mu \in \mathcal{M}(\Omega)$ whose moments match those in the sequence $\mathbf{m}^{(k)}$.⁴ In general, an arbitrary sequence of numbers may not correspond to a feasible sequence of moments. The problem of deciding whether or not a sequence of numbers is a feasible sequence of moments is called the classical moment problem [20]. Depending on the set $\Omega$, we find three important instances of this problem:

(i) the Hamburger moment problem, when $\Omega = \mathbb{R}$,

(ii) the Stieltjes moment problem, when $\Omega = \mathbb{R}_+$, and

(iii) the Hausdorff moment problem, when $\Omega = [0, 1]$.

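For univariate distributions, necessary and sufficient conditions for feasibility of these instances of the classical moment problem can be given in terms of certain matrices being positive semidefinite, as follows. Let us define, for any $s \geq 0$, the following Hankel matrices of moments,

$$R_{2s} = \begin{bmatrix} m_0 & m_1 & \cdots & m_s \\ m_1 & m_2 & \cdots & m_{s+1} \\ \vdots & \vdots & \ddots & \vdots \\ m_s & m_{s+1} & \cdots & m_{2s} \end{bmatrix}, \qquad R_{2s+1} = \begin{bmatrix} m_1 & m_2 & \cdots & m_{s+1} \\ m_2 & m_3 & \cdots & m_{s+2} \\ \vdots & \vdots & \ddots & \vdots \\ m_{s+1} & m_{s+2} & \cdots & m_{2s+1} \end{bmatrix}. \tag{12}$$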



Then, we have the following feasibility result for the Hamburger moment problem [20]:⁵

Theorem 4.1: A necessary and sufficient condition for a sequence of moments $\mathbf{m}^{(2s)} = (m_0, m_1, \ldots, m_{2s})$ to be feasible in $\Omega = \mathbb{R}$ is $R_{2s} \succeq 0$.

Notice that the Hankel matrix associated with the sequence of spectral moments $(1, m_1(A_\mathcal{G}), \ldots, m_{2s}(A_\mathcal{G}))$ of a finite graph $\mathcal{G}$ always satisfies the Hamburger feasibility condition, $R_{2s} \succeq 0$. We now describe the dual formulation proposed in [18] and [19] to compute the solution of the infinite-dimensional optimization problem in (11) by solving a single semidefinite program.

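Using duality theory, one can associate a dual variable $y_i$ to each equality constraint of the primal (11) to obtain (see [19] for more details):

$$\begin{aligned} Z_D = \min_{y_0, \ldots, y_k}\ & \sum_{i=0}^{k} y_i m_i \\ \text{s.t.}\ & \sum_{i=0}^{k} y_i x^i - 1 \geq 0, \quad \text{for } x \in T, \\ & \sum_{i=0}^{k} y_i x^i \geq 0, \quad \text{for } x \in \Omega. \end{aligned} \tag{13}$$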


Notice that the dual constraints are univariate polynomials in $x$. Since a univariate polynomial is nonnegative if and only if it can be written as a sum of squares of polynomials, the dual problem can be formulated as a sum-of-squares program (SOSP) that can be numerically solved via semidefinite programming. (For more details on SOSP and SDP, the interested reader is referred to [21] and [22].) Karlin and Isii proved the following result concerning strong duality [23]:



4 In what follows, we assume that our measures are densities, hence $m_0 = 1$.

5 Feasibility conditions for the Stieltjes and Hausdorff moment problems can also be found in [20], but they are not relevant in this paper.

Fig. 4. Numerical solution of the SOSP described in Example 4.1. The staircase function corresponds to the cumulative distribution function of $\mu_\mathcal{G}$, $F(\alpha)$. Observe how the (numerical) function $Z_D(\alpha)$ is greater than or equal to $F(\alpha)$ for all values of $\alpha$.



Theorem 4.2: If the Hankel matrix of moments $R_{2s}$ defined in (12) is positive definite, then $Z_P = Z_D$.

Since the Hankel matrix of a sequence of spectral moments is always positive semidefinite (Theorem 4.1), Theorem 4.2 implies that strong duality holds whenever $\det R_{2s} \neq 0$. This determinant is zero only for very degenerate networks, and we assume strong duality holds for the networks studied in this paper.
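To see what "degenerate" can mean here, it helps to build the Hankel matrix in (12) for two tiny graphs. The sketch below uses exact rational arithmetic and Sylvester's criterion (all leading principal minors positive) to test positive definiteness. The function names are hypothetical, and the moment sequences were computed by hand from the known spectra of the two graphs: the complete graph on four nodes (eigenvalues $3, -1, -1, -1$) and the path on three nodes (eigenvalues $\sqrt{2}, 0, -\sqrt{2}$). A standard fact about Hankel matrices of atomic measures is that their rank cannot exceed the number of distinct atoms, which is why the first graph yields a singular $3 \times 3$ matrix.

```python
from fractions import Fraction


def hankel(moments, start, size):
    """size x size Hankel matrix with entries moments[start + i + j]."""
    return [[Fraction(moments[start + i + j]) for j in range(size)]
            for i in range(size)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def is_positive_definite(matrix):
    """Sylvester's criterion: every leading principal minor is positive."""
    return all(determinant([row[:k] for row in matrix[:k]]) > 0
               for k in range(1, len(matrix) + 1))


# Moments (m_0, ..., m_4), computed by hand from the known spectra.
k4_moments = [1, 0, 3, 6, 21]                            # complete graph K4
p3_moments = [1, 0, Fraction(4, 3), 0, Fraction(8, 3)]   # path on 3 nodes

for name, m in (("K4", k4_moments), ("P3", p3_moments)):
    r4 = hankel(m, 0, 3)   # R_{2s} with s = 2
    print(name, determinant(r4), is_positive_definite(r4))
```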

The optimization framework we have described above can be used to solve Problem 1, since the solution of the dual problem in (13), when $m_j = m_j(A_\mathcal{G})$ for $j = 1, 2, \ldots, k$, satisfies:

$$Z_D \geq \int_{T} 1\, d\mu_\mathcal{G} = \frac{1}{n} \left| \{ \lambda_i : \lambda_i \in T \} \right|, \tag{14}$$

where $\mu_\mathcal{G}$ is the spectral density of $\mathcal{G}$. Then, $Z_D$ is the optimal upper bound on the fraction of eigenvalues of $A_\mathcal{G}$ that lie in the set $T$ given a truncated sequence of spectral moments. We illustrate this result in the following example.

Example 4.1: Let us consider the spectral distribution $\mu_\mathcal{G}(x) = \sum_{i=1}^{5} w_i\, \delta(x - x_i)$, with atomic masses located at $(x_i)_{1 \leq i \leq 5} = (-2, -1, 0, 1, 2)$ and weights $(w_i)_{1 \leq i \leq 5} = (1/9, 2/9, 3/9, 2/9, 1/9)$. The sequence of moments of $\mu_\mathcal{G}$ is $(m_k)_{1 \leq k \leq 5} = (0, 4/3, 0, 4, 0)$. Let us define $F(\alpha) = \int_{-\infty}^{\alpha} \mu_\mathcal{G}(x)\, dx$, i.e., the cumulative distribution of $\mu_\mathcal{G}(x)$, and denote by $Z_D(\alpha)$ the numerical solution to the dual SOSP in (13) for $T = (-\infty, \alpha]$. According to (14), $Z_D(\alpha)$ is an upper bound of $F(\alpha)$ for all values of $\alpha$. In Fig. 4, we verify this result by plotting the cumulative distribution $F(\alpha)$ and $Z_D(\alpha)$ for $\alpha = [-5 : 0.25 : 3]$.
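The moment sequence and the cumulative distribution of this example are easy to reproduce exactly. The sketch below does so with rational arithmetic; it does not solve the dual SOSP, it only evaluates the quantities that the dual solution is compared against. The function names are ours.

```python
from fractions import Fraction

atoms = [-2, -1, 0, 1, 2]
weights = [Fraction(1, 9), Fraction(2, 9), Fraction(3, 9),
           Fraction(2, 9), Fraction(1, 9)]


def moment(k):
    """k-th moment of the atomic distribution: sum of w_i * x_i**k."""
    return sum(w * x ** k for x, w in zip(atoms, weights))


def cdf(alpha):
    """F(alpha): total weight of the atoms located at or below alpha."""
    return sum(w for x, w in zip(atoms, weights) if x <= alpha)


print([moment(k) for k in range(1, 6)])
# Evaluate F on the grid alpha = -5, -4.75, ..., 3 used in the example.
grid = [Fraction(-5) + Fraction(step, 4) for step in range(33)]
print([(float(a), float(cdf(a))) for a in grid if a in atoms])
```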

B. Solution to Problem 2

In this subsection, we derive bounds on the smallest and largest eigenvalues of $A_\mathcal{G}$ from only the knowledge of a truncated sequence of spectral moments. For this purpose, we apply the technique proposed in [24] to compute the smallest interval $[a, b]$ containing the support⁶ of a positive Borel measure $\mu$ from its complete sequence of moments $(m_r)_{r \geq 0}$. In [24], a technique was also proposed to compute tight bounds on the values of $a$ and $b$ when only a truncated sequence of moments $(m_r)_{0 \leq r \leq k}$ is known. In the context of spectral graph theory, we can apply this technique to a sequence of spectral moments in order to bound the support of the spectral measure $\mu_\mathcal{G}$ of a graph $\mathcal{G}$. In this context, the extreme values, $a$ and $b$, of the smallest interval containing the support of $\mu_\mathcal{G}$ correspond to the minimum and maximum eigenvalues of $A_\mathcal{G}$, denoted by $\lambda_{\min}(A_\mathcal{G})$ and $\rho(A_\mathcal{G})$, respectively. Since we can compute the first five spectral moments in terms of structural properties using the results in Section III, this technique allows us to compute bounds on $\lambda_{\min}(A_\mathcal{G})$ and $\rho(A_\mathcal{G})$ in terms of structural properties.

6 Recall that the support of a finite Borel measure $\mu$ on $\mathbb{R}$, denoted by $\mathrm{supp}(\mu)$, is the smallest closed set $B$ such that $\mu(\mathbb{R} \setminus B) = 0$.

We describe the scheme proposed in [24] to compute the smallest interval $[a, b]$ by solving a series of SDPs in one variable. As we shall show below, at step $s$ of this series of SDPs, we are given a sequence of moments $(m_1, \ldots, m_{2s+1})$ and solve two SDPs whose solution provides an inner approximation $[\alpha_s, \beta_s] \subseteq [a, b]$. As we increase $s$ in this series, we obtain two sequences $(\alpha_s)_{s \in \mathbb{N}}$ and $(\beta_s)_{s \in \mathbb{N}}$ that are respectively monotone nonincreasing and nondecreasing, and converge to $a$ and $b$ as $s \to \infty$. In our case, we have expressions for the first five spectral moments, $(m_1, \ldots, m_5)$; hence, we can solve the first two steps of the series of SDPs. The solutions, $\alpha_s$ and $\beta_s$, of these SDPs provide upper and lower bounds on $\lambda_{\min}(A_\mathcal{G})$ and $\rho(A_\mathcal{G})$, respectively, in terms of structural properties.

In order to formulate the series of SDPs proposed in [24],
we need to define the so-called localizing matrix [25]. Given
a sequence of moments, $\mathbf{m}^{(2s+1)} = (m_1, \dots, m_{2s+1})$, the
localizing matrix is a Hankel matrix defined as:

$$H_s(c) = R_{2s+1} - c\,R_{2s}, \tag{15}$$

where $R_{2s}$ and $R_{2s+1}$ are the Hankel matrices of moments
defined in (12). Hence, for a given sequence of moments, the
entries of $H_s(c)$ depend affinely on the variable $c$. Then, we
can compute $\alpha_s$ and $\beta_s$ as follows [24]:

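Proposition 1: Let $\mathbf{m}^{(2s+1)} = (m_1, \dots, m_{2s+1})$ be the
truncated sequence of moments of a positive Borel measure $\mu$.
Then,

$$a \leq \alpha_s \triangleq \max_{\alpha} \{\alpha : H_s(\alpha) \succeq 0\}, \tag{16}$$

$$b \geq \beta_s \triangleq \min_{\beta} \{\beta : -H_s(\beta) \succeq 0\}, \tag{17}$$

for $[a, b]$ being the smallest interval containing $\mathrm{supp}(\mu)$.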

Remark 4.1: Observe that $\alpha_s$ and $\beta_s$ are the solutions to
two SDPs in one variable, which can be efficiently solved
using standard optimization software (for example, CVX [26]).
Notice that the matrix involved in the semidefinite constraints
in (16) and (17), $H_s(c)$, has size $(s + 1) \times (s + 1)$. Hence, the
computational complexity of solving each SDP is polynomial
in $s$ [24]. Since $s$ is a small number in our context (i.e., $s = 2$
if we use five moments in Proposition 1), the computational
cost of solving these SDPs is negligible in comparison with the
cost of counting triangles, quadrangles and pentagons in $\mathcal{G}$,
which requires £Li (^J 1 )' £ti (^V)' and ELi (^ 2 ) 
operations, respectively. 
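As a rough illustration of Proposition 1, the following Python sketch computes $\alpha_s$ and $\beta_s$ by bisection on the scalar variable $c$, checking positive semidefiniteness of $H_s(c)$ through its smallest eigenvalue. This is a stand-in for a call to an SDP solver such as CVX, not the authors' implementation. The function names, the search bracket `radius`, and the tolerance `eps` are assumptions made for this example, and the bisection relies on $R_{2s}$ being positive definite so that feasibility is monotone in $c$.

```python
import numpy as np

def hankel_pair(moments, s):
    """Build (R_2s, R_2s+1) from the list [m_0, m_1, ..., m_{2s+1}]."""
    m = np.asarray(moments, dtype=float)
    idx = np.add.outer(np.arange(s + 1), np.arange(s + 1))  # idx[i, j] = i + j
    return m[idx], m[idx + 1]

def is_psd(M, eps=1e-9):
    """Numerical test for positive semidefiniteness (eps is an assumed tolerance)."""
    return np.linalg.eigvalsh(M).min() >= -eps

def bisect_threshold(feasible, lo, hi, tol=1e-10):
    """Largest c in [lo, hi] with feasible(c), assuming feasible(lo) holds
    and feasibility is lost monotonically as c grows."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def support_bounds(moments, s, radius=1e3):
    """Return (alpha_s, beta_s) for H_s(c) = R_{2s+1} - c R_{2s}."""
    R0, R1 = hankel_pair(moments, s)
    # alpha_s = max{c : H_s(c) >= 0}
    alpha = bisect_threshold(lambda c: is_psd(R1 - c * R0), -radius, radius)
    # beta_s = min{c : -H_s(c) >= 0}; search over u = -c to reuse the same routine
    beta = -bisect_threshold(lambda u: is_psd(-(R1 + u * R0)), -radius, radius)
    return alpha, beta

# Moments (m_0, ..., m_5) of the 5-cycle, whose adjacency spectrum has three distinct values.
alpha_2, beta_2 = support_bounds([1, 0, 2, 0, 6, 2], s=2)
```

For the 5-cycle the spectrum has only three distinct eigenvalues, so the step $s = 2$ already recovers the extreme eigenvalues; for a general graph the values returned are the inner bounds of Proposition 1.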
For the spectral measure of a graph, the smallest
interval $[a, b]$ becomes $[\lambda_{\min}(A_{\mathcal{G}}), \rho(A_{\mathcal{G}})]$; thus, given a
truncated sequence of spectral moments, Proposition 1 provides
an efficient numerical scheme to compute the
bounds $\alpha_s \geq \lambda_{\min}(A_{\mathcal{G}})$ and $\beta_s \leq \rho(A_{\mathcal{G}})$. Furthermore, for
$s = 1$ and $2$, one can analytically solve the SDPs in (16)
and (17). For example, in the case $s = 1$, we are given
a sequence of three spectral moments, $(m_1, m_2, m_3)$, with
localizing matrix

$$H_1(c) = \begin{bmatrix} m_1 - c\,m_0 & m_2 - c\,m_1 \\ m_2 - c\,m_1 & m_3 - c\,m_2 \end{bmatrix}.$$

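From (5), the spectral moments of simple graphs satisfy $m_0 = 1$,
$m_1 = 0$, $m_2 = 2e/n$, and $m_3 = 6\Delta/n$. One can prove that
the optimal values of $\alpha_1$ and $\beta_1$ are the smallest and largest
roots of $\det H_1(c) = 0$, which is a second-order polynomial in
the variable $c$. With $m_0 = 1$ and $m_1 = 0$, this polynomial reduces
to $m_2 c^2 - m_3 c - m_2^2$. Then, we have the following bounds on $\rho(A_{\mathcal{G}})$
and $\lambda_{\min}(A_{\mathcal{G}})$ in terms of the number of nodes $n$, edges $e$,
and triangles $\Delta$ in $\mathcal{G}$:

$$\begin{aligned}
\rho(A_{\mathcal{G}}) &\geq \beta_1 = \frac{6\Delta + \sqrt{36\Delta^2 + 32e^3/n}}{4e}, \\
\lambda_{\min}(A_{\mathcal{G}}) &\leq \alpha_1 = \frac{6\Delta - \sqrt{36\Delta^2 + 32e^3/n}}{4e}.
\end{aligned} \tag{18}$$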
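The closed-form bounds in (18) are simple enough to evaluate directly. The short Python sketch below does so; the function name `s1_bounds` and its argument names are chosen for this illustration only.

```python
import math

def s1_bounds(n, e, triangles):
    """Evaluate (alpha_1, beta_1) from (18) given the number of nodes n,
    edges e, and triangles in the graph."""
    disc = math.sqrt(36 * triangles**2 + 32 * e**3 / n)
    alpha_1 = (6 * triangles - disc) / (4 * e)  # upper bound on lambda_min
    beta_1 = (6 * triangles + disc) / (4 * e)   # lower bound on rho
    return alpha_1, beta_1

# Complete graph K4: n = 4 nodes, e = 6 edges, 4 triangles.
alpha_1, beta_1 = s1_bounds(4, 6, 4)
```

For the complete graph $K_4$, whose adjacency eigenvalues are $3$ and $-1$, both bounds are attained. For a triangle-free graph such as the Petersen graph ($n = 10$, $e = 15$) the lower bound reduces to $\sqrt{2e/n} = \sqrt{3}$, below the true spectral radius of $3$.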



As mentioned above, tighter bounds on the spectral radius
can be found as we increase the value of $s$ in Proposition 1.
In the case $s = 2$, we are given a sequence of five spectral
moments $(m_1, m_2, \dots, m_5)$ with localizing matrix

$$H_2(c) = \begin{bmatrix}
m_1 - c & m_2 - c\,m_1 & m_3 - c\,m_2 \\
m_2 - c\,m_1 & m_3 - c\,m_2 & m_4 - c\,m_3 \\
m_3 - c\,m_2 & m_4 - c\,m_3 & m_5 - c\,m_4
\end{bmatrix}. \tag{19}$$



As we proved in Section III, these moments depend on the
number of nodes, edges, cycles of length 3 to 5, the sum of
squares of degrees $W_2$, and the degree-triangle correlation $\mathcal{C}_{dt}$.
Since we are using much richer structural information than in
the case $s = 1$, we should expect the resulting bounds to
be substantially tighter (as we shall verify in Section V). For
$s = 2$, the optimal values $\alpha_2$ and $\beta_2$ can also be analytically
computed, as follows. First, note that $-H_2(c) \succeq 0$ if and only
if all the eigenvalues of $H_2(c)$ are nonpositive. The characteristic
polynomial of $H_2(c)$ can be written as

$$\phi_2(\lambda) = \det(\lambda I - H_2(c)) = \lambda^3 + p_1 \lambda^2 + p_2 \lambda + p_3,$$

where $p_j$ is a polynomial of degree $j$ in the variable $c$ (with
coefficients depending on the moments). Thus, by Descartes'
rule, all the eigenvalues of $H_2(c)$ are nonpositive if and only if
$p_j \geq 0$, for $j = 1, 2,$ and $3$. In fact, one can prove that the
optimal values of $\alpha_2$ and $\beta_2$ in (16) and (17) can be computed
as the smallest and largest roots of $p_3(c) = \det H_2(c) = 0$,
which is a third-degree polynomial in the variable $c$ [24].
Therefore, the expressions that allow us to compute the
optimal bounds are:

$$\begin{aligned}
\rho(A_{\mathcal{G}}) &\geq \beta_2 = \max\{\mathrm{roots}[p_3(c)]\}, \\
\lambda_{\min}(A_{\mathcal{G}}) &\leq \alpha_2 = \min\{\mathrm{roots}[p_3(c)]\},
\end{aligned} \tag{20}$$


Fig. 5. Scatter plot of the spectral radius, $\rho(G_i)$, versus the lower bounds
$\beta_1(G_i)$ (crosses) and $\beta_2(G_i)$ (circles), as well as the random-graph-based
estimator $W(G_i)$ (squares), where each point is associated with one of the
100 social subgraphs considered in our experiments.



where $p_3(c) = d_3 c^3 + d_2 c^2 + d_1 c + d_0$, with

$$\begin{aligned}
d_0 &= 2 m_2 m_3 m_4 - m_5 m_2^2 - m_3^3 + m_1 m_3 m_5 - m_1 m_4^2, \\
d_1 &= m_2 m_3^2 - m_2^2 m_4 + m_1 m_5 m_2 - m_1 m_3 m_4 - m_5 m_3 + m_4^2, \\
d_2 &= m_4 m_1 m_2 - m_3 m_2^2 + m_1 m_3^2 - m_1^2 m_5 + m_5 m_2 - m_4 m_3, \\
d_3 &= m_4 m_1^2 - 2 m_1 m_2 m_3 + m_2^3 - m_4 m_2 + m_3^2.
\end{aligned}$$

There are closed-form expressions for the roots of this third-
order polynomial (for example, Cardano's formula [27]),
although the resulting expressions for the roots are rather
complicated and do not provide much insight.
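Since the closed-form roots are unwieldy, in practice one would likely find them numerically. The sketch below, with function names chosen for this illustration, evaluates the coefficients $d_0, \dots, d_3$ given above, checks them against a direct evaluation of $\det H_2(c)$, and extracts the extreme roots with `numpy.roots`. Taking the real part of the computed roots assumes that the moment matrix $R_4$ is positive definite, in which case all three roots are real.

```python
import numpy as np

def cubic_coefficients(m1, m2, m3, m4, m5):
    """Coefficients (d3, d2, d1, d0) of p_3(c) = det H_2(c), with m_0 = 1."""
    d0 = 2*m2*m3*m4 - m5*m2**2 - m3**3 + m1*m3*m5 - m1*m4**2
    d1 = m2*m3**2 - m2**2*m4 + m1*m5*m2 - m1*m3*m4 - m5*m3 + m4**2
    d2 = m4*m1*m2 - m3*m2**2 + m1*m3**2 - m1**2*m5 + m5*m2 - m4*m3
    d3 = m4*m1**2 - 2*m1*m2*m3 + m2**3 - m4*m2 + m3**2
    return d3, d2, d1, d0

def det_H2(c, m1, m2, m3, m4, m5):
    """Direct evaluation of det(R_5 - c R_4), used only as a cross-check."""
    m = np.array([1.0, m1, m2, m3, m4, m5])
    idx = np.add.outer(np.arange(3), np.arange(3))  # idx[i, j] = i + j
    return np.linalg.det(m[idx + 1] - c * m[idx])

def s2_bounds(m1, m2, m3, m4, m5):
    """Return (alpha_2, beta_2) as the smallest and largest roots of p_3(c)."""
    roots = np.roots(cubic_coefficients(m1, m2, m3, m4, m5)).real
    return roots.min(), roots.max()

# Spectral moments m_1..m_5 of the 5-cycle.
alpha_2, beta_2 = s2_bounds(0, 2, 0, 6, 2)
```

For the 5-cycle the cubic factors as $-4(c - 2)(c^2 + c - 1)$, so the largest root coincides with the spectral radius $2$.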

In this subsection, we have presented a convex optimization
framework to compute optimal bounds on the maximum and
minimum eigenvalues of a graph $\mathcal{G}$ from a truncated sequence
of its spectral moments. Since we have expressions for spectral
moments in terms of structural properties, these bounds relate
the eigenvalues of a graph with its structural properties.

V. Numerical Simulations and Structural 
Implications 

In this section, we analyze real data from a regional network
of Facebook that spans 63,731 users (nodes) connected by
817,090 friendships (edges) [28]. In order to corroborate our
results in different network topologies, we extract multiple
medium-size social subgraphs from the Facebook graph by
running a Breadth-First Search (BFS) around different starting
nodes. Each BFS induces a social subgraph spanning all nodes
2 hops away from a starting node. We use this approach
to generate a set $\mathbf{G} = \{G_i\}_{i \leq 100}$ of 100 different social
subgraphs centered around 100 randomly chosen nodes.$^{7}$
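The extraction procedure can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the procedure used for the experiments: the graph is assumed to be stored as a dictionary mapping each node to a list of neighbors, "2 hops away" is read as "within 2 hops", and the function name `two_hop_subgraph` is hypothetical.

```python
from collections import deque

def two_hop_subgraph(adj, start, hops=2):
    """Return (nodes, edges) of the subgraph induced by all nodes within
    `hops` steps of `start`, found by a truncated breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == hops:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = set(dist)
    # keep every edge of the original graph whose endpoints were both reached
    edges = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges

# Toy example: a path 0-1-2-3-4 explored from node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes, edges = two_hop_subgraph(path, 0)
```

Repeating this from 100 randomly chosen starting nodes would produce a collection analogous to $\mathbf{G}$.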

In our first numerical experiment, we compute the first
five spectral moments $\mathbf{m}_5(G_i) = (m_1(G_i), \dots, m_5(G_i))$ for
each social subgraph $G_i \in \mathbf{G}$. From these moments, we then
7 Although this procedure is common in studying large social networks, it
introduces biases that must be considered carefully [31].
Fig. 6. Histogram of eigenvalues of a social subgraph of Facebook with
2,404 nodes.



compute the lower bounds on the spectral radius $\beta_1(G_i)$ and
$\beta_2(G_i)$ using Proposition 1. Fig. 5 is a scatter plot where
each cross has coordinates $(\rho(G_i), \beta_1(G_i))$ and each circle
has coordinates $(\rho(G_i), \beta_2(G_i))$, for all $G_i \in \mathbf{G}$. In the same
figure, we have also included a cloud of red squares with
coordinates $(\rho(G_i), W(G_i))$, where $W(G_i) \triangleq \sum_i d_i^2 / \sum_i d_i$ is
the estimator based on synthetic random graphs (2). Observe
how the spectral radii $\rho(G_i)$ of these social subgraphs are
remarkably close to the theoretical lower bound $\beta_2(G_i)$.
In particular, the correlation coefficient between $\rho(G_i)$ and
$\beta_2(G_i)$ is equal to 0.995 (while the correlation between $\rho(G_i)$
and the estimator based on random networks, $W(G_i)$, is equal
to 0.974). Therefore, it is reasonable to use $\beta_2(G_i)$ as an
estimate of $\rho(G_i)$ for social subgraphs.
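For small graphs, the quantities compared in this experiment can be computed directly from the adjacency matrix. The sketch below assumes the moments are $m_k = \frac{1}{n}\mathrm{tr}(A^k)$, which is consistent with $m_2 = 2e/n$ and $m_3 = 6\Delta/n$, and assumes the random-graph estimator is the ratio of the sum of squared degrees to the sum of degrees. The function names are illustrative, and the dense matrix powers used here would not scale to the full Facebook graph.

```python
import numpy as np

def spectral_moments(A, k_max=5):
    """Return [m_1, ..., m_k_max] with m_k = trace(A^k) / n."""
    n = A.shape[0]
    P = np.eye(n)
    moments = []
    for _ in range(k_max):
        P = P @ A                      # P now holds the next power of A
        moments.append(np.trace(P) / n)
    return moments

def degree_estimator(A):
    """Assumed form of the random-graph estimator: sum(d_i^2) / sum(d_i)."""
    d = A.sum(axis=1)
    return (d ** 2).sum() / d.sum()

# Complete graph K4 as a toy example.
A = np.ones((4, 4)) - np.eye(4)
moments = spectral_moments(A)
W = degree_estimator(A)
```

The resulting moment vector can then be passed to any routine implementing Proposition 1 to obtain $\beta_1$ and $\beta_2$.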

In what follows, we analyze the spectral moments of online
social networks to reveal the set of structural properties having
the highest impact on the spectral radius. Empirical evidence
strongly suggests that many real-world networks present
heavy-tailed eigenvalue distributions [29], [30]. For example,
in Fig. 6 we have included the histogram of eigenvalues of
a subgraph of Facebook with 2,404 nodes, where we can
observe the following two typical properties in the spectrum
of online social networks: (i) the largest eigenvalue of the
network is well separated from the rest of the eigenvalues (spectral
dominance), and (ii) the bulk of eigenvalues concentrates
around the origin. In this case, we can numerically approximate
high-order moments as $m_k(A_{\mathcal{G}}) \approx \frac{1}{n}\lambda_1^k$. Furthermore,
we have from the results in Section III that the fifth spectral moment is equal to
$m_5(A_{\mathcal{G}}) = \frac{1}{n}\left[10\Pi + 10\,\mathcal{C}_{dt} - 30\Delta\right] \approx \frac{1}{n}\lambda_1^5$. Therefore, we
have the following estimator for the spectral radius of online
social networks:

$$\lambda_1 \approx \hat{\lambda}_1 \triangleq \left(10\Pi + 10\,\mathcal{C}_{dt} - 30\Delta\right)^{1/5}. \tag{21}$$

For example, for the social subgraph with 2,404 nodes mentioned
above, the exact value of the spectral radius is $\lambda_1 = 60.9$,
while the estimator is $\hat{\lambda}_1 = 62.6$.
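The estimator in (21) needs only three structural counts. The following Python sketch evaluates it; the function and argument names are illustrative, and the degree-triangle correlation is assumed to be $\mathcal{C}_{dt} = \sum_i d_i t_i$, which is the reading consistent with the fifth-moment expression above.

```python
def spectral_radius_estimate(pentagons, c_dt, triangles):
    """Evaluate (10*Pi + 10*C_dt - 30*Delta)^(1/5) from structural counts."""
    return (10 * pentagons + 10 * c_dt - 30 * triangles) ** 0.2

# Complete graph K5: 12 five-cycles, 10 triangles, and every node has
# degree 4 and touches 6 triangles, so C_dt = 5 * 4 * 6 = 120.
estimate = spectral_radius_estimate(pentagons=12, c_dt=120, triangles=10)
```

For $K_5$ the quantity inside the parentheses equals $\mathrm{tr}(A^5) = 4^5 - 4 = 1020$, so the estimate is slightly below the true spectral radius of $4$, reflecting the small contribution of the non-dominant eigenvalues.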

We now use (21) to unveil the set of structural properties
that are most influential on the spectral radius. In Fig. 7, we
plot (in semilog scale) the values of $\Pi$, $\mathcal{C}_{dt}$, and $\Delta$ for each one
of the 100 different social subgraphs, $G_i \in \mathbf{G}$, considered in







Fig. 7. Number of triangles $\Delta$ (green triangles), degree-triangle correlation
$\mathcal{C}_{dt}$ (red crosses), and number of pentagons $\Pi$ (blue circles) for each one of
the 100 social subgraphs considered in our experiments (in semilog scale).



our previous experiment. Observe how the number of triangles
$\Delta$ is always much smaller than $\Pi + \mathcal{C}_{dt}$. Therefore, for online
social networks, we can simplify (21) as follows,

$$\lambda_1 \approx \left(10\Pi + 10\,\mathcal{C}_{dt}\right)^{1/5}.$$

Fig. 8 is a scatter plot where each circle has coordinates given by
the spectral radius $\rho(G_i)$ and the value of this simplified estimator
for $G_i$, for all $G_i \in \mathbf{G}$. We observe how our
approximation presents an excellent performance in practice,
outperforming the popular estimator, $W(G_i)$, based on random
networks (see Fig. 5).

The simplified estimator provides a clear insight about what
structural properties have the strongest impact on the spectral
radius of online social networks. In particular, it unveils that
both the number of pentagons, $\Pi$, and the degree-triangle
correlation, $\mathcal{C}_{dt}$, are structural properties with a strong influence
on the spectral radius.

The tightness of our bounds depends on the nature of the 
data used. In the following examples, we illustrate the quality 
of our bounds for an Internet and an e-mail network: 

Example 5.1 (Enron e-mail network): In this example we
study the spectral properties of a subgraph of the Enron
e-mail network [32]. In this network, nodes correspond to
e-mail addresses and an edge $(i, j)$ exists if $i$ sent at least one
e-mail to $j$ (or vice versa). The subgraph under study has
$n = 3{,}215$ nodes, $e = 36{,}537$ edges, and its largest eigenvalue
is $\lambda_1 = 95.18$. Using the results in Section III, we compute
the first five spectral moments of the adjacency matrix to be:
$m_1 = 0$, $m_2 = 22.47$, $m_3 = 394.7$, $m_4 = 33{,}491$, and
$m_5 = 2{,}603{,}200$. From Proposition 1, we obtain the following
lower bound on the largest eigenvalue: $\beta_2 = 78.53 < \lambda_1$.
We can also compare our bound with the estimator in (2),
corresponding to a random network with the same degree
distribution. The value of this estimator is equal to 124.57.
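The bound in this example can be cross-checked from the reported moments alone. The sketch below uses the observation that the roots of $\det(R_5 - c\,R_4) = 0$ are the eigenvalues of $R_4^{-1} R_5$ whenever $R_4$ is invertible, which is a shortcut assumed here in place of an SDP solver. The function name is illustrative, and because the moments are rounded as printed, the result can only be expected to agree with the reported $\beta_2$ approximately.

```python
import numpy as np

def beta2_from_moments(m1, m2, m3, m4, m5):
    """Largest root of det(R_5 - c R_4) = 0, computed as the largest
    eigenvalue of R_4^{-1} R_5 (assumes R_4 is invertible, m_0 = 1)."""
    m = np.array([1.0, m1, m2, m3, m4, m5])
    idx = np.add.outer(np.arange(3), np.arange(3))  # idx[i, j] = i + j
    R4, R5 = m[idx], m[idx + 1]
    return np.linalg.eigvals(np.linalg.solve(R4, R5)).real.max()

# Rounded spectral moments reported for the Enron subgraph.
enron_beta2 = beta2_from_moments(0, 22.47, 394.7, 33491, 2603200)
```

The value obtained this way should land close to the reported bound of 78.53, and well below both the true spectral radius and the random-network estimate.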

Example 5.2 (AS-Skitter Internet network): We now consider
a subgraph of the Internet network at the Autonomous
Systems (AS) level, which was obtained from the Skitter data
collection in CAIDA [33]. Our subgraph has $n = 2{,}248$
nodes, $e = 20{,}648$ edges, and its largest eigenvalue is
$\lambda_1 = 91.3$. The spectral moments of its adjacency matrix
are $m_1 = 0$, $m_2 = 18.37$, $m_3 = 341.1$, $m_4 = 40{,}001$,
and $m_5 = 2{,}777{,}018$, and the resulting lower bound is
$\beta_2 = 74.72 < \lambda_1$. In this case, the estimator based on random
networks produces a value of 219.1, which is very loose.
Therefore, estimators based on random networks can be very
misleading in the analysis of the Internet graph.

Fig. 8. Scatter plot of the spectral radius, $\rho(G_i)$, versus the simplified
spectral estimator for $G_i$, where each circle is associated with one of the 100
social subgraphs considered in our experiments.

In this section, we have first shown that $\beta_2(G_i)$ can be used
as an estimator of the spectral radius $\rho(G_i)$ for online social
subgraphs, outperforming the estimator based on random
networks. Furthermore, we have analyzed the spectral moments of
online social networks to unveil the set of structural properties
having the highest impact on the spectral radius. In particular,
we have found that the number of pentagons and the
degree-triangle correlation strongly influence the spectral radius of
online social networks.

VI. Conclusions

A fundamental question in the field of network science is to
understand the relationship between the structural properties
of a network and its dynamical performance. The common
approach to study this relationship is to use synthetic network
models. Although very common, synthetic models present
some major flaws: (i) these models are only suitable to study a
very limited range of structural properties, and (ii) they
implicitly induce structural properties that are not directly controlled
and can influence the network dynamical performance.

In this paper, we have proposed an alternative approach
to study the relationship between a network structure and its
dynamics that is not based on synthetic models. Our approach
exploits the close connection between the dynamical
performance of many dynamical processes that can take place in a
network and its eigenvalue spectrum. Consequently, we have
studied how structural properties of a network relate to its
eigenvalue spectrum using algebraic graph theory and convex
optimization. In particular, we have derived expressions that
explicitly relate structural properties of a network with its
spectral moments. We have also introduced an optimization
framework that allows us to extract optimal bounds on spectral
properties of interest using semidefinite programming.

Using our approach, we have unveiled those structural
properties that have the strongest impact on the spectral
properties of a collection of social subgraphs. In particular,
we have found that the number of closed cycles of lengths
3 to 5 (quantified by $\Delta$, $Q$, and $\Pi$), as well as the sum
and sum-of-squares of the degrees, and the degree-triangle
correlation $\mathcal{C}_{dt}$ have a direct influence on the eigenvalue
spectrum. Furthermore, in the case of online social networks,
we have found that the number of pentagons and the
degree-triangle correlation strongly influence the spectral radius of
the network.



Appendix 



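Lemma 3.3: Let $\mathcal{G}$ be a simple graph. Denote by $p_i$, $t_i$, and
$d_i$ the number of pentagons, triangles, and edges touching node
$i$ in $\mathcal{G}$, respectively. Then,

$$m_5(A_{\mathcal{G}}) = \frac{1}{n}\left[\sum_{i=1}^{n} 2p_i + 10\sum_{i=1}^{n} t_i (d_i - 2) + 10\sum_{i=1}^{n} t_i\right].$$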



Proof: As in Lemma [33} we count the number of closed 
walks of length 5 in $\mathcal{G}$. We classify these walks based on the
structure of the subgraph underlying each walk. We provide
a classification of the walk types in Fig. 2, where we also
include expressions for the number of closed walks of each
type. We now provide the details on how to compute those
expressions for each walk type:

(a) The number of closed walks of this type starting at $i$
is equal to twice the number of pentagons touching
$i$; hence, the total number is given by $\sum_{i=1}^{n} 2p_i$.

(b) In order to count walks of this type, it is convenient
to define $t_{pqr}$ as the indicator function that takes
value 1 if there exists a triangle connecting vertices
$p$, $q$, and $r$ (0, otherwise). Note that this indicator
satisfies $\sum_{q=1}^{n}\sum_{r=1}^{n} t_{pqr} = 2t_p$, where $t_p$ is the
number of triangles touching node $p$. Hence, the
number of closed walks of type (b), which we denote
by $w^{(b)}$, can be written as:

$$w^{(b)} = \sum_{i=1}^{n}\sum_{p=1}^{n}\sum_{q \neq i}\sum_{r \neq i} a_{ip}\, t_{pqr},$$

where $a_{ip}$ indicates the existence of an edge from $i$
to $p$, and $t_{pqr}$ indicates the existence of a triangle
connecting $q$ and $r$ with $p$. We can then perform the
following algebraic manipulations,

$$\begin{aligned}
w^{(b)} &\overset{(i)}{=} \sum_{p=1}^{n}\sum_{q=1}^{n}\sum_{r=1}^{n} \Big(\sum_{i \neq q, r} a_{ip}\Big)\, t_{pqr} \\
&\overset{(ii)}{=} \sum_{p=1}^{n}\sum_{q=1}^{n}\sum_{r=1}^{n} (d_p - 2)\, t_{pqr} \\
&= 2\sum_{p=1}^{n} t_p (d_p - 2),
\end{aligned}$$

where in equality (i) we have changed the order of
the subindices and imposed the inequality constraints
on subindex $i$. In equality (ii), we take into account
that $\sum_{i \neq q, r} a_{ip} = d_p - 2$, since $p$ is connected to $q$
and $r$ in this walk type.

(c) We can use the indicator function to write the
total number of walks of this type, $w^{(c)}$, as follows,

$$w^{(c)} = 2\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} t_{ijk}\,(d_i - 2) = 4\sum_{i=1}^{n} (d_i - 2)\, t_i,$$

where the last expression comes from reordering the
summations and $\sum_{j=1}^{n}\sum_{k=1}^{n} t_{ijk} = 2t_i$.

(d) The number of walks starting at $i$ in this type is equal
to $4t_i(d_i - 2)$, where we have included a $-2$ in the
parenthesis to take into account that two of the edges
touching $i$ are part of the triangle. The coefficient 4
accounts for the two possible directions in which we can walk
the triangle and the two possible choices for the first
step of the walk (towards the triangle or towards the
single edge).

(e)-(f) These types of walks correspond to the set of closed
walks of length 5 that visit all (and only) the edges
of a triangle. Given a particular triangle touching $i$,
we can count the number of walks of this type to be
equal to 10, where 8 of them are of type (e) and 2
of type (f).

Hence, we obtain the expression for $m_5(A_{\mathcal{G}})$ by summing up all
the above contributions (and simple algebraic manipulations). ■
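The counting argument can be sanity-checked by brute force on small graphs. The Python sketch below sums the contributions of walk types (a) through (f) from the proof, namely $2p_i$, $10\,t_i(d_i - 2)$, and $10\,t_i$ per node, and compares the total with $\mathrm{tr}(A^5)$. The helper names are illustrative, pentagons are taken to be 5-cycles that need not be induced, and the exhaustive enumeration is only practical for graphs with a handful of nodes.

```python
import itertools
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of a simple graph on nodes 0..n-1."""
    A = np.zeros((n, n), dtype=int)
    for u, v in edges:
        A[u, v] = A[v, u] = 1
    return A

def local_counts(A):
    """Return (d, t, p): degree, triangles, and 5-cycles touching each node."""
    n = A.shape[0]
    d = A.sum(axis=1)
    t = np.diag(np.linalg.matrix_power(A, 3)) // 2  # (A^3)_ii = 2 t_i
    p = np.zeros(n, dtype=int)
    # every 5-cycle is enumerated 10 times (5 starting nodes x 2 directions)
    for cyc in itertools.permutations(range(n), 5):
        if all(A[cyc[k], cyc[(k + 1) % 5]] for k in range(5)):
            p[list(cyc)] += 1
    p //= 10
    return d, t, p

def closed_walks_5(A):
    """Closed 5-walks as the sum of the contributions of types (a)-(f)."""
    d, t, p = local_counts(A)
    return int(np.sum(2 * p + 10 * t * (d - 2) + 10 * t))

# Triangle 0-1-2 with a pendant node 3 attached to node 0.
paw = adjacency(4, [(0, 1), (1, 2), (0, 2), (0, 3)])
walks = closed_walks_5(paw)
trace = int(np.trace(np.linalg.matrix_power(paw, 5)))
```

Dividing either quantity by $n$ gives the fifth spectral moment, so agreement between `walks` and `trace` on a test graph is a direct check of the lemma for that graph.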